Segmentation of Brain Tissues from MRI Images Using Multitask Fuzzy Clustering Algorithm

In recent years, brain magnetic resonance imaging (MRI) image segmentation has drawn considerable attention. The segmentation result provides a basis for medical diagnosis and directly influences clinical treatment. Nevertheless, MRI images suffer from noise and grayscale inhomogeneity, and the performance of traditional segmentation algorithms still needs improvement. In this paper, we propose a novel brain MRI image segmentation algorithm based on the fuzzy C-means (FCM) clustering algorithm to improve segmentation accuracy. First, we introduce a multitask learning strategy into FCM to extract public information among different segmentation tasks. The resulting algorithm combines the advantages of both approaches and can utilize both the public information shared among tasks and the individual information within each task. Then, we design an adaptive task weight learning mechanism and propose a weighted multitask fuzzy C-means (WMT-FCM) clustering algorithm. Under the adaptive task weight learning mechanism, each task obtains its optimal weight and achieves better clustering performance. Simulated MRI images from McConnell BrainWeb are used to evaluate the proposed algorithm. Experimental results demonstrate that the proposed method provides more accurate and stable segmentation results than its competitors on MRI images with various levels of noise and intensity inhomogeneity.


Introduction
With the increasing demand for medical services, medical imaging technology continues to improve. This technology plays a major role in computer-assisted medicine. There are multimodal medical imaging technologies, such as magnetic resonance imaging (MRI), positron emission tomography (PET) scanning, and computed tomography (CT) scanning. MRI has the advantages of a high data rate, no radiation, and high soft-tissue contrast [1]. MRI is generally used to visualize the structure and tissue of a patient [2].
As the number of medical images grows, manual interpretation of image information becomes impractical. Experts differ in experience and knowledge, so it is difficult to obtain uniform and precise segmentation results [3]. Computer-assisted medical image processing plays an increasingly important role. Image segmentation is an indispensable part of medical image processing [4]. Classifying the different tissues in an image provides a reference for doctors in disease diagnosis and intervention decisions and helps improve diagnostic accuracy and efficiency. Hence, research on medical image segmentation has great clinical significance.
Traditional image segmentation methods are divided into distinct categories according to their principles, such as threshold, clustering, region-based, and edge-based methods [5]. The fuzzy C-means (FCM) algorithm was first proposed by Bezdek et al. [6]. FCM is widely used owing to its applicability and simplicity [7]. It can describe the fuzziness of images, so fuzzy clustering is appropriate for MRI images. Nevertheless, the performance of traditional FCM still needs further improvement [8]. Its core problem in brain MRI image segmentation is sensitivity to noise and to the initialization of cluster centroids. To solve this problem, many improved FCM algorithms have been proposed. They attempt to improve performance by introducing local spatial information [7,9], integrating bioinspired algorithms [10][11][12], and enhancing the image [13]. Ji et al. [9] introduced a method called RSCFCM for brain MRI image segmentation, which adds a factor for the spatial direction to deal with noise and improves the segmentation accuracy. Meena Prakash et al. [14] employed a brain MRI segmentation algorithm based on FCM that integrates spatial information and contrast enhancement, but its clustering performance is not much better than that of the original FCM algorithm. Pham et al. [7] integrated the PSO algorithm and kernelized fuzzy entropy clustering with spatial information and bias correction, called the PSO-KFECSB algorithm, which improved the robustness to noise and initialization at the expense of a higher computational cost. Vinurajkumar and Anandhavelu [13] proposed an enhanced fuzzy segmentation framework for extracting white matter, which exhibited low computational time, but its segmentation results are sensitive to the initialization of the fuzzy partition matrix.
MRI images of different subjects share much common information, and this related information could improve the segmentation performance. Classical FCM deals with only a single task. It pays no attention to related tasks and utilizes only limited information [15]. To overcome this limitation, many multitask-related algorithms have been proposed, such as transfer-learning clustering algorithms [16], multitask clustering algorithms [15,17,18], multiview clustering algorithms [3,19], collaborative clustering algorithms [20], and subspace clustering algorithms. Multitask learning learns related tasks simultaneously and shares useful information, such as representations and parameters, among them. This strategy improves the clustering performance and yields higher accuracy [21]. Hua et al. [3] designed a multiview fuzzy clustering algorithm to extract multiple feature data from the original image. Their experimental results show that the method improves the segmentation effect. Jiang et al. [18] proposed a distributed multitask fuzzy C-means (DMFCM) clustering algorithm for MRI image segmentation, which can extract common and individual information among different clustering tasks. The public cluster centroids represent the common information of different tasks. DMFCM significantly outperforms traditional FCM. However, because the common information is obtained directly from the original pixel data, the computational complexity greatly increases.
Generally, current brain MRI image segmentation algorithms suffer from shortcomings such as sensitivity to the cluster initialization, lack of robustness to noise, and high computational complexity. In this study, we explore a new fuzzy clustering algorithm to address these problems. Based on the traditional FCM algorithm, we integrate a multitask learning strategy and propose a weighted multitask fuzzy C-means clustering algorithm (WMT-FCM). WMT-FCM learns multiple different but related tasks simultaneously to extract public information. Through the adaptive weight learning mechanism, each task is assigned a weight that is learned adaptively, which leads to a better clustering effect.
The contributions of this research are summarized as follows: (1) We integrate the traditional FCM algorithm and a multitask learning strategy with a new objective function and propose an improved fuzzy clustering method to enhance the accuracy of brain MRI image segmentation. (2) We design an adaptive weight learning mechanism to obtain optimal weights for all tasks. Under this mechanism, the public information extracted from different tasks is more accurate, and each task can achieve a better clustering effect. (3) We regard the public cluster centroids as the public information of different tasks. Because the images contain a large amount of pixel data, we capture public information from cluster centroids instead of raw pixel data, which reduces the computational complexity.

Fuzzy C-Means Algorithm.
In 1965, Zadeh published a paper on fuzzy sets. This paper used "fuzzy" to describe the uncertainty of classification. A membership function was proposed to indicate the fuzzy degree of elements [22]. Compared with classical set theory, elements in fuzzy sets have no strict boundaries. In fuzzy theory, elements are assigned membership values instead of clear categories.
Bezdek et al. introduced fuzzy theory into the hard C-means (HCM) algorithm and proposed FCM [6]. The FCM algorithm divides targets into numerous subcategories according to the uncertainty. The idea of FCM is to assign each data instance to all clusters with membership values [23,24]. It is an unsupervised fuzzy clustering algorithm that requires no human intervention during execution [25]. In addition, no threshold needs to be set in advance. HCM is a hard partitioning method in which each membership value is either 1 or 0. Compared with HCM, FCM is more suitable for dealing with fuzzy and uncertain problems [26].
The fuzzy clustering algorithm is widely applied to medical image processing. Militello et al. [27] proposed a semiautomated and interactive approach based on the spatial fuzzy C-means algorithm to segment masses on dynamic contrast-enhanced breast MRI. Al-Saeed et al. [28] proposed a fast-generalized fuzzy C-means algorithm and used the unsupervised algorithm to segment the liver from the rest of the abdominal organs on CT scans. Zhao et al. [29] integrated a deep belief network and FCM into an unsupervised deep clustering method for lung cancer patient classification from lung CT images. Militello et al. [30] applied FCM to enhance automatic cell colony detection. Navaei Lavasani et al. [31] used the fuzzy C-means algorithm to segment prostate lesions on prostate dynamic contrast-enhanced MRI and obtained an increase in diagnostic credibility. Rundo et al. [32] integrated T1w and T2w MRI structural information based on the fuzzy C-means algorithm to enhance prostate gland segmentation.
Table 1 shows the symbols used in the FCM algorithm. Assume the input dataset with $N$ data instances is $X = \{x_1, x_2, \ldots, x_N\}$, $U = [u_{ij}]$ is the membership matrix, $u_{ij}$ is the membership value of the $i$th data sample to the $j$th cluster, and $C$ ($1 < C < N$) is the number of subcategories. The objective function of the FCM algorithm [6] is as follows:

$$J_{FCM} = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \| x_i - v_j \|^2 \tag{1}$$

In the objective function $J_{FCM}$, the constant $m$ ($m > 1$) indicates the degree of ambiguity. The larger the value of $m$, the higher the fuzziness of the clustering, so a large value does not help to reduce the fuzziness. As $m$ approaches 1, the clustering result approaches that of the HCM algorithm. Usually, $m$ is set to 2 [2,3,24]. $\| x_i - v_j \|^2$ is the squared Euclidean distance between data sample $x_i$ and cluster center $v_j$. The constraints on the membership values are as follows:

$$\sum_{j=1}^{C} u_{ij} = 1, \quad 0 \le u_{ij} \le 1, \quad i = 1, 2, \ldots, N \tag{2}$$

The FCM algorithm minimizes the objective function by iteratively calculating the membership degrees and cluster centers.
The Lagrange multiplier method is used to solve the objective function. The cluster centers and membership values can be iteratively updated by the following equations:

$$v_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \tag{3}$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\| x_i - v_j \|}{\| x_i - v_k \|} \right)^{2/(m-1)}} \tag{4}$$

Considering the uncertainty and unclearness of brain tissue boundaries, the fuzzy clustering algorithm can be employed in image segmentation. The input dataset $X = \{x_1, x_2, \ldots, x_N\}$ is the image pixel dataset, where $x_i$ represents the grayscale of the $i$th pixel of the image. The segmentation of images is thus transformed into a clustering problem. That is, the $N$ pixels are divided into $C$ clusters according to the final membership matrix.
The steps for segmenting images using the FCM algorithm are summarized as follows: (1) set the iteration stop threshold ε, the number of clusters C, and the fuzzy index m; (2) initialize the cluster centers randomly; (3) update the membership matrix and cluster centers according to equations (3) and (4); (4) calculate the objective function; (5) if the objective function value converges, the algorithm stops; otherwise, it returns to step (3).
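The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration for one-dimensional grayscale values, not the authors' MATLAB implementation; the function name `fcm`, the `seed` argument, and the choice to draw the initial centers from the data are assumptions made for the example.

```python
import numpy as np

def fcm(x, n_clusters, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """Plain FCM on 1-D grayscale values (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(-1, 1)          # N x 1 pixel values
    # Step 2: random initial centers, here drawn from the data (an assumption)
    v = x[rng.choice(len(x), size=n_clusters, replace=False)].copy()  # C x 1
    prev_obj = np.inf
    for _ in range(max_iter):
        # Squared distances between every pixel and every center (N x C)
        d2 = np.maximum((x - v.T) ** 2, 1e-12)
        # Step 3a: membership update, each row sums to 1
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Step 3b: center update as a membership-weighted mean
        um = u ** m
        v = (um.T @ x) / um.sum(axis=0)[:, None]
        # Step 4: objective value for the current u and v
        obj = float(np.sum(um * (x - v.T) ** 2))
        # Step 5: stop once the objective no longer changes
        if abs(prev_obj - obj) < eps:
            break
        prev_obj = obj
    return u, v.ravel()

# Toy usage: two well-separated intensity groups
u, v = fcm([0.0, 0.1, 0.2, 10.0, 10.1, 10.2], n_clusters=2)
labels = u.argmax(axis=1)   # hard segmentation from the final memberships
```

For an image, the pixel array would be flattened before the call and `labels` reshaped back to the image size afterwards.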

Multitask Learning Strategy.
Multitask learning refers to performing multiple related tasks at the same time. It uses the relationship between these tasks to enhance the clustering performance of a single task [33]. Multitask learning is defined as follows: assume the learning tasks are $\mathcal{T} = \{t_1, t_2, \ldots, t_T\}$, where all tasks are related but different. Multitask learning aims to improve the learning performance of each task by using public knowledge [34]. It is applied to natural language processing, disease prediction, computer vision, etc. [35]. The information contained in each task helps other tasks learn better. Because different tasks usually have different noise, learning them together offsets some of the noise. The multitask learning strategy therefore generalizes better than single-task learning and is more robust.

WMT-FCM.
FCM is a fuzzy clustering method based on an objective function. Essentially, solving the objective function is an iterative optimization process. Therefore, the algorithm is easily affected by noise and random initialization of cluster centers and can fall into a local optimum. In the clustering process, different MRI images have very similar cluster centers. The cluster centroids represent related information of different tasks. This related information helps the objective function converge and mitigates the negative effect of noise in MRI images [18], which benefits the cluster analysis. However, traditional FCM is only suitable for a single-task scenario and exploits limited information. It cannot mine public information between different tasks. Multitask technology has the advantage of mining public information contained in multiple tasks. To utilize the public information and improve the segmentation performance, we introduce multitask technology into the traditional FCM algorithm. The multitask clustering algorithm enables the collaborative learning of different tasks in the clustering process and makes maximum use of the data information of each task. Because different segmentation tasks usually have different noise, we cannot directly assign the same weight to each task. A task with a better clustering effect should contribute more to the public information. Therefore, a weighted multitask fuzzy C-means (WMT-FCM) algorithm with adaptive adjustment capability is proposed in this paper. Figure 1 is the schematic diagram of the WMT-FCM algorithm. Assume a dataset contains $T$ tasks and each task has $N_t$ pixels. The proposed objective function is given in equation (5), subject to the constraints in equation (6), where $x_{i,t}$ is the $i$th data sample of the $t$th task, $v_{j,t}$ is the $j$th private cluster center of the $t$th task, and $Z = \{z_1, z_2, \ldots, z_D\}$ is the public cluster center vector of all tasks.
$U^{(t)} = [u_{ij,t}]_{C_t \times N_t}$ is the private membership matrix of the $t$th task. $p_{jd,t}$ represents the membership value of private cluster center $v_{j,t}$ to the $d$th public cluster center $z_d$. $D$ is the number of public cluster centers. λ is a balance parameter that controls the influence of the public clustering term. $c$ adjusts the penalty corresponding to the weights of each task. $W^{(t)} = \{w_{1,t}, w_{2,t}, \ldots, w_{D,t}\}$ is the weight vector of the $t$th task. $w_{d,t}$ represents the importance of the $t$th task to the $d$th public cluster.
The first part of the objective function contains $T$ independent FCM clustering tasks. It aims to learn the within-task partition matrix and cluster centers. The second part aims to learn public information about all tasks. It uses the FCM objective function to learn the public partition matrix and public cluster centers. The third part is the regularization term, for which we introduce the Shannon entropy as the regularizer. It aims to identify the optimal weights of each task.
Although the tasks share public information, they also differ from one another. For example, each task is affected by a different level of noise and has a different clustering effectiveness. Therefore, the influence of different tasks should be adjusted according to the actual situation instead of being kept equal. To account for the difference between tasks, we introduce the adaptive weight $w_{d,t}$, which controls the impact of the $t$th task on the $d$th public cluster center. If the relationship between the private cluster centers and the public cluster centers is clearer, a higher weight is given, which means the task contributes more to the public cluster centers. Conversely, if the relationship with the public cluster centers is fuzzier, a lower weight is given. Through adaptive weight adjustment, the algorithm can utilize the effective public information of different tasks to the greatest extent and improve the clustering performance.

Optimization.
The Lagrange multiplier method is used to minimize equation (5). According to the corresponding constraints, the Lagrangian function is defined in equation (7), where $a_{i,t}$, $b_{j,t}$, and $c_d$ are the Lagrange multipliers corresponding to the constraints ($\forall i \in 1, 2, \ldots, N_t$, $\forall j \in 1, 2, \ldots,$

Optimizing Membership Matrix.
Taking the derivative of $J_{\text{WMT-FCM}}$ with respect to $u_{ij,t}$ and setting it to zero, we obtain equation (8). From equation (8), $u_{ij,t}$ is expressed in terms of the multiplier $a_{i,t}$ in equation (9). According to $\sum_{l=1}^{C_t} u_{il,t} = 1$ and equation (9), $a_{i,t}$ is obtained in equation (10) after the necessary calculations. Substituting equation (10) into equation (9) gives the iterative formula for the private membership value $u_{ij,t}$ of the $t$th task, equation (11). Similarly, the updating equation of the membership value $p_{jd,t}$ is given in equation (12).
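The chain of steps described for the private membership can be illustrated with a short derivation. It is a sketch under two assumptions that the surrounding text suggests but that are not confirmed here: $u_{ij,t}$ appears only in the within-task FCM term $\sum_{i}\sum_{j} u_{ij,t}^{m}\|x_{i,t}-v_{j,t}\|^{2}$, and the constraint $\sum_{l} u_{il,t}=1$ enters the Lagrangian with multiplier $a_{i,t}$ and a minus sign. The paper's own expressions may differ in form.

```latex
% Stationarity with respect to u_{ij,t} (assumed sign convention for a_{i,t})
\frac{\partial L}{\partial u_{ij,t}}
  = m\,u_{ij,t}^{\,m-1}\,\|x_{i,t}-v_{j,t}\|^{2} - a_{i,t} = 0
\;\Longrightarrow\;
u_{ij,t} = \left(\frac{a_{i,t}}{m\,\|x_{i,t}-v_{j,t}\|^{2}}\right)^{\frac{1}{m-1}}

% Enforce \sum_{l=1}^{C_t} u_{il,t} = 1 to eliminate the multiplier
\left(\frac{a_{i,t}}{m}\right)^{\frac{1}{m-1}}
  = \frac{1}{\sum_{l=1}^{C_t} \|x_{i,t}-v_{l,t}\|^{-\frac{2}{m-1}}}

% Substitute back: the familiar FCM membership form, per task
u_{ij,t}
  = \frac{1}{\sum_{l=1}^{C_t}
      \left(\dfrac{\|x_{i,t}-v_{j,t}\|^{2}}{\|x_{i,t}-v_{l,t}\|^{2}}\right)^{\frac{1}{m-1}}}
```

Under these assumptions the private membership update has the same shape as in single-task FCM; the multitask coupling would then act through the cluster centers rather than through $u_{ij,t}$ directly.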

Optimizing Cluster Centroid.
Taking the derivative of $J_{\text{WMT-FCM}}$ with respect to $v_{j,t}$ and setting it to zero, we obtain equation (13). From equation (13), the private clustering centroid $v_{j,t}$ is obtained in equation (14) after the necessary calculations. Similarly, the updating equation of the public clustering centroid $z_d$ is given in equation (15).

Optimizing Weight.
To derive the optimal weights, we take the derivative of the Lagrangian function with respect to $w_{d,t}$ and set it to zero, which gives equation (16).

Journal of Healthcare Engineering
According to $\sum_{t=1}^{T} w_{d,t} = 1$ and equation (16), we obtain the optimal weight $w_{d,t}$ in equation (17), using steps similar to those for optimizing the membership matrix. The specific steps of WMT-FCM are summarized in Algorithm 1.
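To make the weight step more concrete, the sketch below shows what an entropy-regularized weight update typically looks like. It rests on assumptions that are not taken from the paper's equations: the public term is taken to be linear in the weights, $\lambda \sum_{d}\sum_{t} w_{d,t}\,\mathrm{cost}_{d,t}$, the regularizer is taken to be $c \sum_{d}\sum_{t} w_{d,t} \ln w_{d,t}$, and `cost[d, t]` is a hypothetical name for the fuzzy distortion between the private centers of task t and public center d. Under those assumptions, the constraint $\sum_{t} w_{d,t} = 1$ leads to a softmax over tasks.

```python
import numpy as np

def entropy_regularized_weights(cost, lam, c):
    """Softmax-style task weights (illustrative; assumed objective form).

    cost[d, t]: hypothetical distortion of task t with respect to public
    center d. Returns w with the same shape, each row summing to 1.
    """
    cost = np.asarray(cost, dtype=float)
    logits = -lam * cost / c
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability only
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

# One public center, three tasks with increasing distortion
w = entropy_regularized_weights([[1.0, 2.0, 3.0]], lam=1.0, c=1.0)
```

In this form a task whose private centers lie closer to a public center receives a larger weight, and a larger `c` flattens the weights toward uniform, which matches the qualitative behavior described for the adaptive weights.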

Te Experimental Dataset.
To demonstrate the improvement of the proposed algorithm, the traditional FCM and DMFCM [18] are selected as comparison algorithms. The dataset of this study is downloaded from BrainWeb, which is provided by the McConnell Brain Imaging Center of the Montreal Neurological Institute, McGill University [36]. This database contains a set of realistic MRI data produced by an MRI simulator. BrainWeb simulates 3-dimensional data volumes using three sequences (T1, T2, and PD weighted). The simulated volumes contain a variety of slice thicknesses, noise levels, and intensity nonuniformity (INU) levels. The ground truth of the cerebrospinal fluid (CSF), the gray matter (GM), the white matter (WM), and the background is available.
The BrainWeb dataset in our work consists of 9 T1-weighted MRI images (slice 90) with 181 × 217 pixels. The MRI images are corrupted with different levels of noise and INU. Details are shown in Table 2. These images are randomly combined into task groups. The ground truth images of the brain MRI images are shown in Figure 2.

Parameters Setting.
λ and $c$ of the objective function influence the cluster centers and weight vectors according to equations (8) and (11). In this study, the optimal parameters of the proposed algorithm are obtained by the grid search strategy. λ and $c$ are set from two grids {20, 40, 60, 80, 100, 120} and {0.2, 0.4, 0.6, 0.8, 1, 1.2}, respectively. In addition, all experiments are conducted with the maximum number of iterations K = 100, termination parameter ε = 0.0001, and fuzzy index m = 2.
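The grid search described above amounts to evaluating every (λ, c) pair and keeping the best one. The sketch below uses the two grids from the text; `score_fn` is a hypothetical callable standing in for "run the algorithm with these parameters and return a quality score such as mean SA", and the toy score at the end exists only to make the example runnable.

```python
from itertools import product

LAMBDA_GRID = [20, 40, 60, 80, 100, 120]
C_GRID = [0.2, 0.4, 0.6, 0.8, 1, 1.2]

def grid_search(score_fn, lambda_grid=LAMBDA_GRID, c_grid=C_GRID):
    """Return (best_score, best_lambda, best_c) over the full grid."""
    best = None
    for lam, c in product(lambda_grid, c_grid):
        score = score_fn(lam, c)          # higher is assumed to be better
        if best is None or score > best[0]:
            best = (score, lam, c)
    return best

# Toy score peaking at lambda = 60, c = 0.8 (placeholder, not real results)
best_score, best_lam, best_c = grid_search(
    lambda lam, c: -((lam - 60) ** 2) - (c - 0.8) ** 2
)
```

With 6 × 6 = 36 combinations the search stays cheap as long as a single run takes seconds, which is one practical reason to prefer an exhaustive grid over a more elaborate tuner here.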
(1) Dice Similarity Coefficient. DSC measures the similarity between the ground truth and the segmentation result. In equation (12), $S_1$ represents the segmentation result and $S_2$ represents the ground truth for a single class. Here, DSC measures the similarity for CSF, GM, WM, and the background. A larger DSC value indicates better performance of the algorithm.
(2) Average Dice Similarity Coefficient. The average Dice similarity coefficient [39] of WM, GM, and CSF is described in equation (13). Because the background is nonbrain tissue, we exclude it when calculating the average DSC ($\mathrm{DSC}_{av}$).
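Both Dice measures are straightforward to compute from label maps. The sketch below uses the standard definition $\mathrm{DSC} = 2|S_1 \cap S_2| / (|S_1| + |S_2|)$; the integer label encoding (0 background, 1 CSF, 2 GM, 3 WM) and the function names are assumptions chosen for the example, and the cluster indices are assumed to have already been matched to the ground-truth classes.

```python
import numpy as np

def dice(seg_mask, gt_mask):
    """Dice similarity coefficient between two boolean masks."""
    seg = np.asarray(seg_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    denom = seg.sum() + gt.sum()
    if denom == 0:                 # both masks empty: treat as perfect overlap
        return 1.0
    return 2.0 * np.logical_and(seg, gt).sum() / denom

def average_dice(seg_labels, gt_labels, tissue_labels=(1, 2, 3)):
    """Mean DSC over the brain tissues, background (label 0) excluded."""
    seg_labels = np.asarray(seg_labels)
    gt_labels = np.asarray(gt_labels)
    scores = [dice(seg_labels == k, gt_labels == k) for k in tissue_labels]
    return float(np.mean(scores))

# Toy label maps with one mislabeled pixel
seg = np.array([0, 1, 1, 2, 3, 3])
gt = np.array([0, 1, 2, 2, 3, 3])
dsc_av = average_dice(seg, gt)
```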
(3) Segmentation Accuracy. The SA index measures the accuracy of the algorithm and is given in equation (14), where $A_i$ is the pixel set of the $i$th cluster in the segmentation result, $B_i$ is the pixel set of the $i$th cluster in the ground truth, and $K$ is the number of clusters. The closer SA is to 1, the better the segmentation performance.
The metrics are averaged over ten repeated experiments since the performance of FCM depends on the random initialization of the cluster centroids. Three images with different levels of noise and intensity nonuniformity are randomly selected as a task group. The inputs are segmented into the following four clusters: background, CSF, GM, and WM. All experiments are conducted in MATLAB 2019a on a PC with a 1.50 GHz Intel Core i7 CPU, 16 GB memory, and Windows 10.
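As an illustration of the SA index, the sketch below assumes the definition commonly used in the FCM segmentation literature, $\mathrm{SA} = \sum_{i=1}^{K} |A_i \cap B_i| \, / \, \sum_{j=1}^{K} |B_j|$, that is, the fraction of pixels assigned to the correct cluster. The paper's own equation may be written differently. The function name and the integer label encoding 0..K-1 are assumptions, and cluster indices are assumed to be matched to ground-truth classes beforehand.

```python
import numpy as np

def segmentation_accuracy(seg_labels, gt_labels, n_clusters):
    """Fraction of pixels whose cluster matches the ground truth."""
    seg = np.asarray(seg_labels)
    gt = np.asarray(gt_labels)
    # |A_i ∩ B_i| summed over clusters
    correct = sum(np.logical_and(seg == k, gt == k).sum() for k in range(n_clusters))
    # |B_j| summed over clusters
    total = sum((gt == k).sum() for k in range(n_clusters))
    return correct / total

# Toy label maps: five of six pixels are labeled correctly
seg = np.array([0, 1, 1, 2, 3, 3])
gt = np.array([0, 1, 2, 2, 3, 3])
sa = segmentation_accuracy(seg, gt, n_clusters=4)
```

Averaging this value over repeated runs with different random initializations gives the kind of mean SA reported in the results.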

Results and Discussion
To verify the improved stability and noise robustness of the WMT-FCM algorithm, this section gives a comparison with classical FCM and DMFCM. There are nine MRI images from BrainWeb, as shown in Table 2. Figure 3 shows the 9 original simulated MRI brain images (slice 90) as well as the corresponding segmentation results of the WMT-FCM algorithm. Segmented images include the entire image and individual tissues. Figure 2 displays the ground truth images of the simulated MRI images (slice 90). Figure 3 qualitatively reveals that the WMT-FCM algorithm can partition the different tissues. The segmented images overlap well with the ground truth on all tested images. Table 3 shows the experimental results of the following three algorithms: FCM, DMFCM, and the proposed WMT-FCM. The results include the mean values of SA and $\mathrm{DSC}_{av}$ over all tissues. The WMT-FCM segmentation performance is significantly better than that of FCM. This confirms that the multitask learning mechanism mines public information from multiple tasks. Compared with DMFCM, the segmentation effect of the WMT-FCM algorithm is also better, which demonstrates the effectiveness of the task weight learning mechanism. To achieve these results, the single-task execution times of FCM, WMT-FCM, and DMFCM are about 1 second, 6 seconds, and 600 seconds, respectively. The execution time of the multitask algorithms is higher than that of classical FCM since learning the public partition matrix and the public cluster centers requires more time. However, the execution time of the WMT-FCM algorithm is still much shorter than that of DMFCM.
To further compare segmentation performance, the Dice similarity coefficient of each tissue is calculated. The result is presented in Table 4. WMT-FCM generally provides better results than FCM and DMFCM in WM, GM, and CSF. Among the three tissues, the performance of WMT-FCM is the best in WM and relatively poor in GM. However, even though the WMT-FCM segmentation performance is slightly inferior in GM, it is still significantly superior to that of FCM and DMFCM. FCM and DMFCM always fail to distinguish CSF accurately. Figure 4 shows SA and $\mathrm{DSC}_{av}$ of the algorithms on the images with 20% INU and different noise levels. As the noise level increases, the clustering performance of all algorithms decreases. The WMT-FCM clustering performance remains better than that of the comparison algorithms at every noise level. In addition, as the noise increases, the WMT-FCM clustering performance decreases less than that of FCM and DMFCM, which indicates that WMT-FCM is more robust to noise. Figure 5 displays the SA variations in repeated trials on image 3. The SA of FCM changes dramatically across trials. Since the initialization of the clustering centroids is random, the FCM clustering performance depends on the initial values of the cluster centers. Therefore, the results of fuzzy clustering-based algorithms are usually inconsistent in repeated trials. However, the segmentation results of WMT-FCM are almost unchanged. In addition, the SA values of WMT-FCM are higher than those of FCM and DMFCM in almost all trials. The repeated trials indicate that WMT-FCM is more robust to the initialization than FCM and DMFCM and consistently generates excellent segmentation performance.

Sensitivity to Initialization.
To compare the algorithms' performance more visually, the visual results are shown in Figures 6 and 7. Figure 6 displays the segmented images of the first to fourth trials on image 3 by FCM, DMFCM, and WMT-FCM. FCM fails to partition the brain tissues, especially CSF, in all trials except the third. In the first trial, all tissues are segmented as a single cluster, the background. In the second trial, FCM drops the CSF region and oversegments GM. FCM shows a relatively good result in the third trial. In the fourth trial, FCM fails to detect CSF as well as GM and oversegments the WM. DMFCM always tends to lose CSF. However, in all four trials, WMT-FCM consistently offers an excellent overlap with the ground truth. A detailed comparison is shown in Figure 7, which indicates that the proposed algorithm has better segmentation performance than FCM, especially on GM and CSF. It can be concluded that the WMT-FCM algorithm provides better performance consistently and is less sensitive to the random initialization.

Algorithm 1
Initialization: set the number of tasks $T$, the number $C_t$ of private centers, the number $D$ of public centers, the termination threshold ε, fuzzy index $m$, the maximum number $K$ of iterations, and the parameters λ and $c$.
Results: the final partition matrix and cluster centers for each task.
for t = 1, ..., T do
    Randomly initialize cluster centroids and weights for the tth task;
end
Randomly initialize public clustering centroids;
for k = 1, ..., K do
    Update $u_{ij,t}$ for each task using equation (11)
    Update $v_{j,t}$ for each task using equation (14)
    Update $p_{jd,t}$ using equation (12)
    Update $z_d$ using equation (15)
    Update $w_{d,t}$ using equation (17)
    Calculate the fitness $J_{\text{WMT-FCM}}(k)$ using equation (5)

Figure 3: Segmentation results of simulated MRI brain images (slice 90) with different noises and INU in the WMT-FCM algorithm. Panels: original image, total, WM, GM, CSF.

Sensitivity to Parameters.
Experiments are conducted to explore the sensitivity of WMT-FCM to its parameters. The crucial parameters, i.e., the balance parameter λ and the regularization coefficient $c$, are involved. Figure 8 shows the sensitivity to the parameters on two different MRI images (image 4 and image 6). WMT-FCM provides relatively good segmentation results when the core parameters are located within a proper interval. Overall, the algorithm performance is slightly sensitive to the parameters, especially in terms of $\mathrm{DSC}_{av}$. The algorithm is more sensitive to the trade-off parameter λ than to the regularization parameter $c$. Therefore, λ plays a major role in obtaining optimal clustering results. As λ increases, the segmentation performance on image 4 (with 1% noise and no INU) shows a decreasing trend, whereas that on image 6 (with 9% noise and 20% INU) first rises and then falls. An excessive trade-off parameter strengthens the influence of public information on the clustering results and leads to undesirable effects. The performance may decrease, especially on images with a low level of noise and INU. In summary, although the proposed algorithm is slightly sensitive to the trade-off parameter λ, it obtains relatively good performance within an appropriate range.

Conclusions
In this paper, we propose a new fuzzy clustering algorithm called WMT-FCM. A multitask learning mechanism is introduced into the FCM algorithm for brain MRI image segmentation. WMT-FCM makes use of both private information in a single task and public information among related tasks. To draw more effective public information from different tasks, we design an adaptive task weighting mechanism. We conduct experiments to validate the proposed algorithm on synthetic MRI images. The results demonstrate that the proposed algorithm provides more accurate segmentation results than the FCM and DMFCM algorithms. WMT-FCM has the following advantages: (1) WMT-FCM is less sensitive to the initialization of cluster centers; (2) the robustness to noise is improved; (3) WMT-FCM is adaptive to the private clustering effect. There are two main limitations of WMT-FCM. First, the algorithm requires more computational time for public information learning and adaptive weight updating. Second, although the proposed algorithm obtains a relatively good effect when the trade-off parameter λ is in the appropriate range, the performance is slightly sensitive to λ. In later research, we will focus on how to tackle these problems. In conclusion, the algorithm based on FCM and multitask learning significantly improves the segmentation performance of brain MRI images. The main limitations of the FCM algorithm, that is, the sensitivity to initialization and noise, have been partially alleviated.